A PARTAN-Accelerated Frank-Wolfe Algorithm for Large-Scale SVM Classification
Frank-Wolfe algorithms have recently regained the attention of the Machine
Learning community. Their solid theoretical properties and sparsity guarantees
make them a suitable choice for a wide range of problems in this field. In
addition, several variants of the basic procedure exist that improve its
theoretical properties and practical performance. In this paper, we investigate
the application of some of these techniques to Machine Learning, focusing in
particular on a Parallel Tangent (PARTAN) variant of the FW algorithm that has
not been previously suggested or studied for this type of problem. We provide
experiments both in a standard setting and using a stochastic speed-up
technique, showing that the considered algorithms obtain promising results on
several medium- and large-scale benchmark datasets for SVM classification.
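The parallel-tangent idea can be illustrated on a generic quadratic over the unit simplex: after each Frank-Wolfe step, the iterate is extrapolated along the direction joining it to the iterate from two steps earlier. The sketch below is a minimal illustration of this scheme, not the paper's implementation; the quadratic objective, exact line searches and feasibility clipping are assumptions made for the example.

```python
import numpy as np

def partan_fw(A, b, iters=200):
    """Minimize f(x) = 0.5 x'Ax - b'x over the unit simplex with a
    PARTAN-style Frank-Wolfe iteration (illustrative sketch)."""
    n = len(b)
    grad = lambda x: A @ x - b
    x_prev = None
    x = np.full(n, 1.0 / n)                      # feasible starting point
    for _ in range(iters):
        g = grad(x)
        s = np.zeros(n); s[np.argmin(g)] = 1.0   # LMO over the simplex: a vertex
        d = s - x
        # exact line search for the quadratic along d, clipped to [0, 1]
        denom = d @ A @ d
        gamma = float(np.clip(-(g @ d) / denom, 0.0, 1.0)) if denom > 0 else 1.0
        z = x + gamma * d                        # plain FW step
        if x_prev is not None:
            # PARTAN extrapolation along (z - x_prev); p sums to zero,
            # so only nonnegativity must be enforced to stay feasible
            p = z - x_prev
            pAp = p @ A @ p
            mu = -(grad(z) @ p) / pAp if pAp > 0 else 0.0
            neg = p < 0
            mu_max = np.min(-z[neg] / p[neg]) if neg.any() else np.inf
            z = z + float(np.clip(mu, 0.0, mu_max)) * p
        x_prev, x = x, z
    return x
```

The extrapolation reuses information from the previous iterate at the cost of one extra line search, which is the source of the acceleration studied in the paper.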
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Training a Support Vector Machine (SVM) requires the solution of a quadratic
programming problem (QP) whose computational complexity becomes prohibitively
expensive for large scale datasets. Traditional optimization methods cannot be
directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions
on the kernel used within the model, efficient algorithms to train SVMs have
been devised under the name of Core Vector Machines (CVMs). This framework
exploits the equivalence of the resulting learning problem with the task of
computing a Minimal Enclosing Ball (MEB) in a feature space, where the data
are implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods
to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast
method to approximate the solution of a MEB problem. In contrast to CVMs, our
algorithms do not require computing the solutions of a sequence of
increasingly complex QPs, and are defined using only analytic optimization
steps. Experiments on a large collection of datasets show that our methods
scale better than CVMs in most cases, sometimes at the price of a slightly
lower accuracy. Like CVMs, the proposed methods can be easily extended to machine
learning problems other than binary classification. Moreover, effective
classifiers are also obtained with kernels that do not satisfy the condition
required by CVMs, so our methods can be applied to a wider set of problems.
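The MEB connection exploited above admits a very compact Frank-Wolfe description: at each iteration the point furthest from the current center enters the weight vector with the analytic step size 1/(k+2). The sketch below illustrates this Badoiu-Clarkson-style scheme in input space (i.e., with a linear kernel); it is an illustration of the underlying idea, not the paper's algorithm.

```python
import numpy as np

def fw_meb(X, iters=500):
    """Approximate the Minimal Enclosing Ball of the rows of X with the
    classic Frank-Wolfe scheme on the MEB dual (illustrative sketch)."""
    n = X.shape[0]
    u = np.full(n, 1.0 / n)              # dual weights on the unit simplex
    for k in range(iters):
        c = u @ X                        # current center: weighted mean
        d2 = np.sum((X - c) ** 2, axis=1)
        i = np.argmax(d2)                # furthest point = FW vertex
        gamma = 1.0 / (k + 2)            # analytic step size, no line search
        u *= (1.0 - gamma)
        u[i] += gamma
    c = u @ X
    r = np.sqrt(np.max(np.sum((X - c) ** 2, axis=1)))
    return c, r
```

Each iteration costs one pass over the data to find the furthest point, which is what makes the MEB view attractive for large-scale training.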
A Novel Frank-Wolfe Algorithm. Analysis and Applications to Large-Scale SVM Training
Recently, there has been a renewed interest in the machine learning community
for variants of a sparse greedy approximation procedure for concave
optimization known as the Frank-Wolfe (FW) method. In particular, this
procedure has been successfully applied to train large-scale instances of
non-linear Support Vector Machines (SVMs). Specializing FW to SVM training has
yielded not only efficient algorithms but also important theoretical results,
including convergence analysis of training algorithms and new characterizations
of model sparsity.
In this paper, we present and analyze a novel variant of the FW method based
on a new way to perform away steps, a classic strategy used to accelerate the
convergence of the basic FW procedure. Our formulation and analysis are focused
on a general concave maximization problem on the simplex. However, the
specialization of our algorithm to quadratic forms is strongly related to some
classic methods in computational geometry, namely the Gilbert and MDM
algorithms.
On the theoretical side, we demonstrate that the method matches the
guarantees in terms of convergence rate and number of iterations obtained by
using classic away steps. In particular, the method enjoys a linear rate of
convergence, a result that has been recently proved for MDM on quadratic forms.
On the practical side, we provide experiments on several classification
datasets, and evaluate the results using statistical tests. Experiments show
that our method is faster than the FW method with classic away steps, and works
well even in the cases in which classic away steps slow down the algorithm.
Furthermore, these improvements are obtained without sacrificing the predictive
accuracy of the obtained SVM model.
Comment: REVISED VERSION (October 2013) -- Title and abstract have been
revised. Section 5 was added. Some proofs have been summarized (full-length
proofs are available in the previous version).
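The classic away-step mechanism referenced above can be sketched for a quadratic over the unit simplex: at each iteration one compares the standard FW direction (toward the best vertex) with the away direction (away from the worst vertex currently in the support) and takes the steeper one, capping the away step length to preserve feasibility. This is a generic textbook sketch of away steps, not the paper's new variant; the objective and line search are assumptions for the example.

```python
import numpy as np

def away_fw(A, b, iters=300):
    """Minimize f(x) = 0.5 x'Ax - b'x over the unit simplex with
    Frank-Wolfe plus classic away steps (illustrative sketch)."""
    n = len(b)
    x = np.zeros(n); x[0] = 1.0                  # start at a vertex
    for _ in range(iters):
        g = A @ x - b
        i_fw = np.argmin(g)                      # best vertex (FW direction)
        active = np.where(x > 1e-12)[0]          # support of the iterate
        i_aw = active[np.argmax(g[active])]      # worst active vertex
        d_fw = -x.copy(); d_fw[i_fw] += 1.0      # e_fw - x
        d_aw = x.copy();  d_aw[i_aw] -= 1.0      # x - e_aw
        if g @ d_fw <= g @ d_aw:                 # pick the steeper descent
            d, gmax = d_fw, 1.0
        else:
            d = d_aw                             # max step keeps x[i_aw] >= 0
            gmax = x[i_aw] / (1.0 - x[i_aw]) if x[i_aw] < 1.0 else np.inf
        dAd = d @ A @ d
        if dAd > 0:                              # exact line search, clipped
            step = float(np.clip(-(g @ d) / dAd, 0.0, gmax))
        else:
            step = min(gmax, 1.0)
        x = x + step * d
    return x
```

Away steps let the iterate remove weight from poorly chosen vertices, which is what restores the linear convergence rate discussed in the abstract.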
Exploring the symbiotic pangenome of the nitrogen-fixing bacterium Sinorhizobium meliloti
Background: Sinorhizobium meliloti is a model system for the study of symbiotic nitrogen fixation. Natural populations of this species show extensive polymorphism at the genetic and phenotypic levels, especially in relation to the symbiotic promotion of plant growth. AK83 and BL225C are two nodule-isolated strains with diverse symbiotic phenotypes; BL225C is more efficient than AK83 in promoting the growth of Medicago sativa plants. In order to investigate the genetic determinants of the phenotypic diversification of S. meliloti strains AK83 and BL225C, we sequenced the complete genomes of these two strains.
Results: With sizes of 7.14 Mbp and 6.97 Mbp, respectively, the genomes of AK83 and BL225C are larger than that of the laboratory strain Rm1021. The core genome of the Rm1021, AK83 and BL225C strains included 5124 orthologous groups, while the accessory genome comprised 2700 orthologous groups. While Rm1021 and BL225C have only three replicons (chromosome, pSymA and pSymB), AK83 also carries two plasmids, 260 and 70 kbp long. We found 65 orthologous groups of genes that were present only in the accessory genome and are therefore candidates for the observed phenotypic diversity, putatively involved in the plant-bacterium interaction. Notably, the symbiotically inefficient AK83 lacked several genes required for microaerophilic growth inside nodules, while several genes for accessory functions related to competition, plant invasion and bacteroid tropism were identified only in the AK83 and BL225C strains. The presence and extent of polymorphism in the regulons of transcription factors involved in the symbiotic interaction were also analyzed. Our results indicate that these regulons are flexible, with a large number of accessory genes, suggesting that regulon polymorphism could also be a key determinant of the variability in symbiotic performance among the analyzed strains.
Conclusions: The extended comparative genomics approach revealed a variable subset of genes and regulons that may contribute to the symbiotic diversity.
Replication Data for: "Fast and Scalable Lasso via Stochastic Frank-Wolfe Methods with a Convergence Guarantee"
Datasets used for the experiments in Section 5 of the paper "Fast and Scalable Lasso via Stochastic Frank-Wolfe Methods with a Convergence Guarantee" (E. Frandi, R. Nanculef, S. Lodi, C. Sartori, J. A. K. Suykens).
Coordinate search algorithms in multilevel optimization
Abstract: Many optimization problems of practical interest arise from the discretization of continuous problems. Classical examples can be found in the calculus of variations, optimal control and image processing. In recent years a number of strategies, broadly known as multilevel methods, have been proposed for the solution of such problems. Inspired by classical multigrid schemes for linear systems, they exploit the possibility of solving the problem on coarser discretization levels to accelerate the computation of a finest-level solution. In this paper, we study the applicability of coordinate search algorithms in a multilevel optimization paradigm. We develop a multilevel derivative-free coordinate search method, where coarse-level objective functions are defined by suitable surrogate models. We employ a recursive V-cycle correction scheme, which exhibits multigrid-like error-smoothing properties. On a practical level, the algorithm is implemented in tandem with a full-multilevel initialization. A suitable strategy to manage the coordinate search stepsize on the different levels is also proposed, which makes a substantial contribution to the overall speed of the algorithm. Numerical experiments on several examples show promising results. The presented algorithm can solve large problems in a reasonable time, thus overcoming the size and convergence-speed limitations typical of coordinate search methods.
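The single-level building block of such a scheme, derivative-free coordinate search with step-size refinement, can be sketched in a few lines: poll the objective along each coordinate direction and halve the step when no poll point improves. This is only an illustrative sketch of plain coordinate search; the multilevel V-cycle, surrogate models and per-level stepsize management of the paper are not reproduced here.

```python
import numpy as np

def coordinate_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Derivative-free coordinate search: poll +/- step along each axis,
    halving the step when no poll point improves (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for sign in (+1.0, -1.0):
                y = x.copy(); y[i] += sign * step
                fy = f(y)
                if fy < fx:                      # accept the first improvement
                    x, fx, improved = y, fy, True
                    break
        if not improved:
            step *= 0.5                          # refine the mesh
            if step < tol:
                break
    return x, fx
```

Because every sweep costs two function evaluations per coordinate, the method becomes expensive in high dimension, which is precisely the limitation the multilevel strategy in the paper targets.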
A PARTAN-Accelerated Frank-Wolfe Algorithm for Large-Scale SVM Classification
© 2015 IEEE. Frank-Wolfe algorithms have recently regained the attention of the Machine Learning community. Their solid theoretical properties and sparsity guarantees make them a suitable choice for a wide range of problems in this field. In addition, several variants of the basic procedure exist that improve its theoretical properties and practical performance. In this paper, we investigate the application of some of these techniques to Machine Learning, focusing in particular on a Parallel Tangent (PARTAN) variant of the FW algorithm for SVM classification, which has not been previously suggested or studied for this type of problem. We provide experiments both in a standard setting and using a stochastic speed-up technique, showing that the considered algorithms obtain promising results on several medium- and large-scale benchmark datasets.